# MoE Architecture

| Model | Developer | License | Tags | Downloads | Likes | Description |
|---|---|---|---|---|---|---|
| Qwen3 30B A3B Llamafile | Mozilla | Apache-2.0 | Large Language Model | 143 | 1 | Qwen3 is the latest generation of large language models in the Qwen series, offering both dense and mixture-of-experts (MoE) models. Extensive training gives it major gains in reasoning, instruction following, agent capabilities, and multilingual support. |
| Qwen3 14B Base Unsloth Bnb 4bit | unsloth | Apache-2.0 | Large Language Model, Transformers | 2,120 | 1 | Qwen3-14B-Base is a dense 14.8-billion-parameter model in the latest Qwen generation, supporting a 32k context length and 119 languages. |
| Qwen3 4B Base | unsloth | Apache-2.0 | Large Language Model, Transformers | 15.15k | 1 | Qwen3-4B-Base is a 4-billion-parameter model in the latest Qwen generation, supporting a 32k context length and multilingual processing. |
| Qwen3 14B Base | Qwen | Apache-2.0 | Large Language Model, Transformers | 9,718 | 21 | A 14.8-billion-parameter pre-trained base model in the latest Qwen generation, with support for 32k long-context understanding. |
| Deepseek R1 Zero | deepseek-ai | MIT | Large Language Model, Transformers | 4,034 | 905 | DeepSeek-R1 is DeepSeek's first-generation reasoning model, trained through reinforcement learning and excelling at mathematics, coding, and reasoning tasks. |
| Granite 3.1 1b A400m Base | ibm-granite | Apache-2.0 | Large Language Model, Transformers | 3,299 | 9 | Granite-3.1-1B-A400M-Base is a language model developed by IBM; a progressive training strategy extends its context length from 4K to 128K, and it supports multilingual and varied text-processing tasks. |
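Two entries above are MoE checkpoints, matching the section title: the "A3B" in Qwen3 30B A3B and the "A400M" in Granite 3.1 1B A400M denote the parameters activated per token, a small slice of the 30B and 1B totals. The sketch below shows the core mechanism, top-k expert routing, in PyTorch; the model width, expert count, and gating details are illustrative assumptions, not the actual configuration of either model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal top-k routed mixture-of-experts (MoE) feed-forward layer.

    Illustrative only: dimensions, expert count, and gating details are
    assumptions, not the configuration of Qwen3-30B-A3B or Granite-3.1.
    """

    def __init__(self, d_model: int = 64, d_ff: int = 256,
                 n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        # The router scores every token against every expert.
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, d_model)
        scores = self.router(x)                     # (n_tokens, n_experts)
        top_w, top_i = scores.topk(self.k, dim=-1)  # keep k experts per token
        top_w = F.softmax(top_w, dim=-1)            # renormalize kept scores
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for slot in range(self.k):
                hit = top_i[:, slot] == e           # tokens routed to expert e
                if hit.any():
                    out[hit] += top_w[hit, slot].unsqueeze(-1) * expert(x[hit])
        return out

moe = TopKMoE()
tokens = torch.randn(10, 64)  # 10 tokens; only 2 of 8 experts run per token
print(moe(tokens).shape)      # torch.Size([10, 64])
```

Because only k experts run per token, per-token compute scales with the active parameters rather than the total, which is how a 30B-parameter model can decode at roughly the cost of a 3B-parameter dense model.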
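Most entries above are tagged as Transformers checkpoints, so they load through the standard Hugging Face API. A minimal sketch follows, assuming a transformers version recent enough to include Qwen3 support and the accelerate package installed for `device_map="auto"`; the repo id is inferred from the Qwen entry in the table.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id assumed from the "Qwen3 14B Base" entry above.
model_id = "Qwen/Qwen3-14B-Base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" shards the model across available devices
# and requires the accelerate package.
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

inputs = tokenizer("Mixture-of-experts models work by", return_tensors="pt")
inputs = inputs.to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```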